12 research outputs found

    Decentralized Learning in Online Queuing Systems

    Motivated by packet routing in computer networks, online queuing systems are composed of queues receiving packets at different rates. Repeatedly, they send packets to servers, each of which treats at most one packet at a time. In the centralized case, the number of accumulated packets remains bounded (i.e., the system is \textit{stable}) as long as the ratio between service rates and arrival rates is larger than $1$. In the decentralized case, individual no-regret strategies ensure stability when this ratio is larger than $2$. Yet, myopically minimizing regret disregards the long-term effects due to the carryover of packets to further rounds. On the other hand, minimizing long-term costs leads to stable Nash equilibria as soon as the ratio exceeds $\frac{e}{e-1}$. Stability with decentralized learning strategies with a ratio below $2$ was a major remaining question. We first argue that for ratios up to $2$, cooperation is required for stability of learning strategies, as selfish minimization of policy regret, a \textit{patient} notion of regret, might indeed still be unstable in this case. We therefore consider cooperative queues and propose the first decentralized learning algorithm guaranteeing stability of the system as long as the ratio of rates is larger than $1$, thus reaching performance comparable to centralized strategies. Comment: NeurIPS 2021 camera ready

    Online Matching in Geometric Random Graphs

    We investigate online maximum cardinality matching, a central problem in ad allocation. In this problem, users are revealed sequentially, and each new user can be paired with any previously unmatched campaign that it is compatible with. Despite the limited theoretical guarantees, the greedy algorithm, which matches incoming users with any available campaign, exhibits outstanding performance in practice. Some theoretical support for this practical success was established in specific classes of graphs, where the connections between different vertices lack strong correlations, an assumption not always valid. To bridge this gap, we focus on the following model: both users and campaigns are represented as points uniformly distributed in the interval $[0,1]$, and a user is eligible to be paired with a campaign if they are similar enough, i.e. the distance between their respective points is less than $c/N$, with $c>0$ a model parameter. As a benchmark, we determine the size of the optimal offline matching in these bipartite random geometric graphs. In the online setting, we investigate the number of matches made by the online algorithm CLOSEST, which greedily pairs incoming points with their nearest available neighbors. We demonstrate that the algorithm's performance can be compared to its fluid limit, which is characterized as the solution to a specific partial differential equation (PDE). From this PDE solution, we can compute the competitive ratio of CLOSEST, and our computations reveal that it remains significantly better than its worst-case guarantee. This model turns out to be related to the online minimum cost matching problem, and we can extend the results to refine certain findings in that area of research. Specifically, we determine the exact asymptotic cost of CLOSEST in the $\epsilon$-excess regime, providing a more accurate estimate than the previously known loose upper bound.

    Pure exploration and regret minimization in matching bandits

    Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version where either a pair or a full matching is sampled sequentially. We prove that it is possible to leverage a rank-1 assumption on the adjacency matrix to reduce the sample complexity and the regret of off-the-shelf algorithms, reaching a linear dependency on the number of vertices (up to polylogarithmic terms).

    Apprentissage et Algorithmes pour le Matching Séquentiel

    This thesis focuses mainly on online matching problems, where sets of resources are sequentially allocated to demand streams. We treat them both from an online learning and a competitive analysis perspective, always in the case when the input is stochastic. On the online learning side, we study how the specific matching structure influences learning in the first part, then how carry-over effects in the system affect its performance. On the competitive analysis side, we study the online matching problem in specific classes of random graphs, in an effort to move away from worst-case analysis. Finally, we explore how learning can be leveraged in the scheduling problem.

    Static Scheduling with Predictions Learned through Efficient Exploration

    A popular approach to go beyond the worst-case analysis of online algorithms is to assume the existence of predictions that can be leveraged to improve performance. Those predictions are usually given by some external sources that cannot be fully trusted. Instead, we argue that trustworthy predictions can be built by algorithms while they run. We investigate this idea in the illustrative context of static scheduling with exponential job sizes. Indeed, we prove that algorithms agnostic to this structure do not perform better than in the worst case. In contrast, when the expected job sizes are known, we show that the best algorithm using this information, called Follow-The-Perfect-Prediction (FTPP), exhibits much better performance. Then, we introduce two adaptive explore-then-commit types of algorithms: they both first (partially) learn expected job sizes and then follow FTPP once their self-predictions are confident enough. On the one hand, ETCU explores in "series", by completing jobs sequentially to acquire information. On the other hand, ETCRR, inspired by the optimal worst-case algorithm Round-Robin (RR), explores efficiently in "parallel". We prove that both of them asymptotically reach the performance of FTPP, with a faster rate for ETCRR. Those findings are empirically evaluated on synthetic data.
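As a rough illustration of the gap between a structure-agnostic rule and one that uses expected job sizes, the sketch below compares idealized Round-Robin with a rule that runs jobs one after another in increasing order of expected size. Several things here are assumptions rather than details from the abstract: the objective (sum of completion times), the ordering rule (the abstract does not spell out how FTPP works), and the helper names `sum_completion_in_order` and `sum_completion_round_robin`, which are hypothetical.

```python
import random


def sum_completion_in_order(sizes, order):
    """Total completion time when jobs run one after another in `order`."""
    clock, total = 0.0, 0.0
    for j in order:
        clock += sizes[j]  # job j finishes once its full size is processed
        total += clock
    return total


def sum_completion_round_robin(sizes):
    """Total completion time under idealized Round-Robin (equal sharing)."""
    clock, total, done = 0.0, 0.0, 0.0
    remaining = len(sizes)
    for s in sorted(sizes):
        # Every unfinished job has received `done` units of work so far;
        # the machine is shared equally among the `remaining` jobs.
        clock += (s - done) * remaining
        total += clock
        done = s
        remaining -= 1
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed instance: 20 jobs with exponential sizes and known means.
    means = [rng.uniform(0.5, 5.0) for _ in range(20)]
    sizes = [rng.expovariate(1.0 / m) for m in means]
    by_mean = sorted(range(len(means)), key=means.__getitem__)
    print("increasing expected size:", sum_completion_in_order(sizes, by_mean))
    print("round-robin:             ", sum_completion_round_robin(sizes))
```

Averaging the two totals over many sampled instances would give a crude picture of how much knowing the means can help. It does not reproduce the exploration phases of ETCU or ETCRR.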

    Online Matching in Geometric Random Graphs

    We investigate online maximum cardinality matching, a central problem in ad allocation. In this problem, users are revealed sequentially, and each new user can be paired with any previously unmatched campaign that it is compatible with. Despite the limited theoretical guarantees, the greedy algorithm, which matches incoming users with any available campaign, exhibits outstanding performance in practice. Some theoretical support for this practical success has been established in specific classes of graphs, where the connections between different vertices lack strong correlations, an assumption not always valid in real-world situations. To bridge this gap, we focus on the following model: both users and campaigns are represented as points uniformly distributed in the interval [0, 1], and a user is eligible to be paired with a campaign if they are "similar enough," meaning the distance between their respective points is less than c/N, where c > 0 is a model parameter. As a benchmark, we determine the size of the optimal offline matching in these bipartite random geometric graphs. We achieve this by introducing an algorithm that constructs the optimal matching and analyzing it. We then turn to the online setting and investigate the number of matches made by the online algorithm CLOSEST, which pairs incoming points with their nearest available neighbors in a greedy manner. We demonstrate that the algorithm's performance can be compared to its fluid limit, which is completely characterized as the solution to a specific partial differential equation (PDE). From this PDE solution, we can compute the competitive ratio of CLOSEST, and our computations reveal that it remains significantly better than its worst-case guarantee. This model turns out to be closely related to the online minimum cost matching problem, and we can extend the results obtained here to refine certain findings in that area of research. Specifically, we determine the exact asymptotic cost of CLOSEST in the ϵ-excess regime, providing a more accurate estimate than the previously known loose upper bound.
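The model in this abstract is concrete enough to simulate. The sketch below is a minimal reading of it, not the authors' code: N campaigns and N users are drawn uniformly from [0, 1], and each arriving user is matched to its nearest still-available campaign if that campaign lies within distance c/N. Having as many users as campaigns is an assumption, as are the function name `simulate_closest` and its `seed` parameter.

```python
import bisect
import random


def simulate_closest(n, c, seed=0):
    """Return the number of matches made by a nearest-available-neighbor rule.

    Assumed setup: n campaigns and n users drawn uniformly from [0, 1];
    a user and a campaign are compatible when their distance is below c / n.
    """
    rng = random.Random(seed)
    free = sorted(rng.random() for _ in range(n))  # unmatched campaigns
    threshold = c / n
    matches = 0
    for _ in range(n):
        user = rng.random()
        i = bisect.bisect_left(free, user)
        # Only the closest free campaign on each side can be the nearest one.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(free)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(free[j] - user))
        if abs(free[best] - user) < threshold:
            free.pop(best)  # the campaign is consumed by this match
            matches += 1
    return matches


if __name__ == "__main__":
    n = 10_000
    for c in (0.5, 1.0, 2.0, 5.0):
        print(c, simulate_closest(n, c) / n)  # empirical matched fraction
```

The printed fraction is only the online side of the comparison. Estimating a competitive ratio would also require the size of the optimal offline matching on the same instance, which this sketch does not compute.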